Overview

Dataset statistics

Number of variables: 9
Number of observations: 245955
Missing cells: 6304
Missing cells (%): 0.3%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 16.9 MiB
Average record size in memory: 72.0 B
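
The percentages and per-record size in this block follow from the raw counts. The sketch below is only an arithmetic cross-check, not the profiler's own code: it assumes that the missing-cell share is taken over rows × columns and that 1 MiB = 1024 × 1024 bytes. The per-column missing counts are copied from the variable sections of this report.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 9
n_observations = 245_955
missing_cells = 6_304
total_size_mib = 16.9

# Per-column missing counts, copied from the variable sections of this report.
missing_by_column = {"large_ms": 1540, "fare_lg": 1540, "lf_ms": 1612, "fare_low": 1612}

# Assumption: the missing share is computed over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells:        {total_cells}")
print(f"missing cells (%):  {missing_pct:.1f}")
print(f"avg record size:    {avg_record_bytes:.1f} B")
print(f"missing, by column: {sum(missing_by_column.values())}")
```

Under these assumptions the results round to the reported 0.3% and 72.0 B, and the four per-column counts add up to the 6304 missing cells.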

Variable types

Numeric: 8
Categorical: 1

Alerts

Duplicates: Dataset has 1 (< 0.1%) duplicate row
High correlation: fare is highly correlated overall with fare_lg and 2 other fields
High correlation: fare_lg is highly correlated overall with fare and 1 other field
High correlation: fare_low is highly correlated overall with fare and 1 other field
High correlation: nsmiles is highly correlated overall with fare
Zeros: passengers has 7439 (3.0%) zeros

Reproduction

Analysis started: 2024-11-28 23:15:57.372306
Analysis finished: 2024-11-28 23:16:07.627442
Duration: 10.26 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json

Variables

Year
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2008.5241
Minimum: 1993
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 1993
5-th percentile: 1996
Q1: 2001
Median: 2008
Q3: 2016
95-th percentile: 2022
Maximum: 2024
Range: 31
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.7033645
Coefficient of variation (CV): 0.0043332138
Kurtosis: -1.1416162
Mean: 2008.5241
Median Absolute Deviation (MAD): 7
Skewness: -0.0091655819
Sum: 4.9400655 × 10^8
Variance: 75.748553
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
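
Several of the Year statistics are tied together by simple identities. The sketch below uses the standard textbook definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); that the profiler uses exactly these formulas is an assumption, although the reported figures agree with them.

```python
# Values copied from the Year statistics in this report.
minimum, maximum = 1993, 2024
q1, q3 = 2001, 2016
mean = 2008.5241
std = 8.7033645

value_range = maximum - minimum  # report lists Range as 31
iqr = q3 - q1                    # report lists IQR as 15
cv = std / mean                  # report lists CV as 0.0043332138
variance = std ** 2              # report lists Variance as 75.748553

print(value_range, iqr)
print(f"CV       = {cv:.10f}")
print(f"variance = {variance:.6f}")
```

Small differences in the last digits are expected, because the inputs are themselves rounded values taken from the report.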

Common values

Value              Count   Frequency (%)
1993               9739    4.0%
1996               9081    3.7%
1997               8949    3.6%
1999               8757    3.6%
1998               8708    3.5%
2001               8648    3.5%
2002               8589    3.5%
2000               8541    3.5%
2003               8488    3.5%
2004               8466    3.4%
Other values (21)  157989  64.2%

Minimum 10 values

Value  Count  Frequency (%)
1993   9739   4.0%
1994   2454   1.0%
1996   9081   3.7%
1997   8949   3.6%
1998   8708   3.5%
1999   8757   3.6%
2000   8541   3.5%
2001   8648   3.5%
2002   8589   3.5%
2003   8488   3.5%

Maximum 10 values

Value  Count  Frequency (%)
2024   1905   0.8%
2023   7788   3.2%
2022   7809   3.2%
2021   7758   3.2%
2020   7520   3.1%
2019   8148   3.3%
2018   8195   3.3%
2017   8232   3.3%
2016   8227   3.3%
2015   8150   3.3%

quarter
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 245955
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value  Count  Frequency (%)
1      63894  26.0%
3      61204  24.9%
2      60587  24.6%
4      60270  24.5%
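
The frequency column can be reproduced from the counts alone. A minimal sketch, assuming each percentage is the count divided by the sum of all counts and rounded to one decimal place (the rounding rule is an assumption; the counts are copied from the table for quarter):

```python
# Counts copied from the "Common Values" table for the quarter variable.
counts = {"1": 63894, "3": 61204, "2": 60587, "4": 60270}

# With no missing values, the counts should add up to the number of observations.
total = sum(counts.values())

# Share of each category, rounded to one decimal place as in the report.
percentages = {value: round(100 * count / total, 1) for value, count in counts.items()}

for value, count in counts.items():
    print(f"{value}  {count}  {percentages[value]:.1f}%")
```

The total comes to 245955, which matches the number of observations, so the four quarters account for every row.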

Length

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
1      63894  26.0%
3      61204  24.9%
2      60587  24.6%
4      60270  24.5%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  245955  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
1      63894  26.0%
3      61204  24.9%
2      60587  24.6%
4      60270  24.5%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  245955  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
1      63894  26.0%
3      61204  24.9%
2      60587  24.6%
4      60270  24.5%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  245955  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
1      63894  26.0%
3      61204  24.9%
2      60587  24.6%
4      60270  24.5%

nsmiles
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1155
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1189.8123
Minimum: 109
Maximum: 2724
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 109
5-th percentile: 285
Q1: 626
Median: 1023
Q3: 1736
95-th percentile: 2510
Maximum: 2724
Range: 2615
Interquartile range (IQR): 1110

Descriptive statistics

Standard deviation: 703.14347
Coefficient of variation (CV): 0.59097007
Kurtosis: -0.840847
Mean: 1189.8123
Median Absolute Deviation (MAD): 481
Skewness: 0.56263484
Sum: 2.9264029 × 10^8
Variance: 494410.74
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
2510                 3221    1.3%
2619                 2038    0.8%
1246                 1902    0.8%
2329                 1818    0.7%
773                  1719    0.7%
2611                 1705    0.7%
372                  1629    0.7%
1139                 1621    0.7%
1465                 1597    0.6%
209                  1465    0.6%
Other values (1145)  227240  92.4%

Minimum 10 values

Value  Count  Frequency (%)
109    15     < 0.1%
115    8      < 0.1%
122    82     < 0.1%
129    2      < 0.1%
130    67     < 0.1%
133    40     < 0.1%
137    29     < 0.1%
145    9      < 0.1%
148    128    0.1%
155    1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
2724   236    0.1%
2704   1062   0.4%
2700   230    0.1%
2636   323    0.1%
2629   4      < 0.1%
2625   336    0.1%
2619   2038   0.8%
2611   1705   0.7%
2608   64     < 0.1%
2588   354    0.1%

passengers
Real number (ℝ)

ZEROS 

Distinct: 3883
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 299.47679
Minimum: 0
Maximum: 8301
Zeros: 7439
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 21
Median: 113
Q3: 339
95-th percentile: 1293
Maximum: 8301
Range: 8301
Interquartile range (IQR): 318

Descriptive statistics

Standard deviation: 511.38949
Coefficient of variation (CV): 1.7076097
Kurtosis: 22.475975
Mean: 299.47679
Median Absolute Deviation (MAD): 105
Skewness: 3.8306029
Sum: 73657815
Variance: 261519.21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1                    7605    3.1%
0                    7439    3.0%
2                    5434    2.2%
3                    4297    1.7%
4                    3752    1.5%
5                    3285    1.3%
6                    2776    1.1%
7                    2616    1.1%
8                    2551    1.0%
9                    2301    0.9%
Other values (3873)  203899  82.9%

Minimum 10 values

Value  Count  Frequency (%)
0      7439   3.0%
1      7605   3.1%
2      5434   2.2%
3      4297   1.7%
4      3752   1.5%
5      3285   1.3%
6      2776   1.1%
7      2616   1.1%
8      2551   1.0%
9      2301   0.9%

Maximum 10 values

Value  Count  Frequency (%)
8301   1      < 0.1%
8103   1      < 0.1%
8023   1      < 0.1%
7857   1      < 0.1%
7718   1      < 0.1%
7661   1      < 0.1%
7555   1      < 0.1%
7553   1      < 0.1%
7469   1      < 0.1%
7390   1      < 0.1%

fare
Real number (ℝ)

HIGH CORRELATION 

Distinct: 36323
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 218.97959
Minimum: 50
Maximum: 3377
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 50
5-th percentile: 107.27
Q1: 164.62
Median: 209.32
Q3: 262.89
95-th percentile: 354.2
Maximum: 3377
Range: 3327
Interquartile range (IQR): 98.27

Descriptive statistics

Standard deviation: 82.372486
Coefficient of variation (CV): 0.37616513
Kurtosis: 32.454488
Mean: 218.97959
Median Absolute Deviation (MAD): 48.59
Skewness: 2.3810786
Sum: 53859124
Variance: 6785.2264
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
209                   33      < 0.1%
182.48                31      < 0.1%
201.14                30      < 0.1%
175                   29      < 0.1%
178                   28      < 0.1%
192                   28      < 0.1%
209.33                27      < 0.1%
170                   27      < 0.1%
231                   27      < 0.1%
191.95                27      < 0.1%
Other values (36313)  245668  99.9%

Minimum 10 values

Value  Count  Frequency (%)
50     1      < 0.1%
50.4   1      < 0.1%
50.41  1      < 0.1%
50.5   1      < 0.1%
50.72  1      < 0.1%
50.8   2      < 0.1%
50.96  1      < 0.1%
50.98  2      < 0.1%
50.99  1      < 0.1%
51     2      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
3377     1      < 0.1%
2716     1      < 0.1%
2628.9   1      < 0.1%
2104.9   1      < 0.1%
2074     1      < 0.1%
2034.35  1      < 0.1%
1991     1      < 0.1%
1950     1      < 0.1%
1871     1      < 0.1%
1841.7   1      < 0.1%

large_ms
Real number (ℝ)

Distinct: 7367
Distinct (%): 3.0%
Missing: 1540
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.66525163
Minimum: 0.0038
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0.0038
5-th percentile: 0.31
Q1: 0.48
Median: 0.6524
Q3: 0.8719
95-th percentile: 1
Maximum: 1
Range: 0.9962
Interquartile range (IQR): 0.3919

Descriptive statistics

Standard deviation: 0.22463466
Coefficient of variation (CV): 0.3376687
Kurtosis: -1.164393
Mean: 0.66525163
Median Absolute Deviation (MAD): 0.1924
Skewness: -0.038183459
Sum: 162597.48
Variance: 0.050460729
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1                    13803   5.6%
0.99                 4691    1.9%
0.5                  4484    1.8%
0.98                 3715    1.5%
0.97                 3359    1.4%
0.96                 3031    1.2%
0.66                 2871    1.2%
0.6                  2780    1.1%
0.52                 2738    1.1%
0.51                 2725    1.1%
Other values (7357)  200218  81.4%

Minimum 10 values

Value   Count  Frequency (%)
0.0038  1      < 0.1%
0.0052  1      < 0.1%
0.0074  2      < 0.1%
0.0077  1      < 0.1%
0.008   1      < 0.1%
0.0081  1      < 0.1%
0.01    15     < 0.1%
0.02    10     < 0.1%
0.03    8      < 0.1%
0.04    4      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
1       13803  5.6%
0.9999  2      < 0.1%
0.9998  24     < 0.1%
0.9997  33     < 0.1%
0.9996  37     < 0.1%
0.9995  35     < 0.1%
0.9994  38     < 0.1%
0.9993  31     < 0.1%
0.9992  45     < 0.1%
0.9991  31     < 0.1%

fare_lg
Real number (ℝ)

HIGH CORRELATION 

Distinct: 37508
Distinct (%): 15.3%
Missing: 1540
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 218.71096
Minimum: 50
Maximum: 2725.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 50
5-th percentile: 102.917
Q1: 161.5
Median: 208.03
Q3: 263.64
95-th percentile: 364.793
Maximum: 2725.6
Range: 2675.6
Interquartile range (IQR): 102.14

Descriptive statistics

Standard deviation: 84.674363
Coefficient of variation (CV): 0.38715189
Kurtosis: 14.627054
Mean: 218.71096
Median Absolute Deviation (MAD): 50.43
Skewness: 1.7053114
Sum: 53456240
Variance: 7169.7477
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
180                   29      < 0.1%
205.46                29      < 0.1%
171                   29      < 0.1%
183                   28      < 0.1%
164.47                27      < 0.1%
208.98                27      < 0.1%
228                   27      < 0.1%
147                   26      < 0.1%
167                   26      < 0.1%
171.55                26      < 0.1%
Other values (37498)  244141  99.3%
(Missing)             1540    0.6%

Minimum 10 values

Value  Count  Frequency (%)
50     1      < 0.1%
50.4   1      < 0.1%
50.41  1      < 0.1%
50.5   1      < 0.1%
50.65  1      < 0.1%
50.72  1      < 0.1%
50.8   2      < 0.1%
50.96  1      < 0.1%
50.98  2      < 0.1%
50.99  1      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
2725.6   1      < 0.1%
2710.9   1      < 0.1%
1897.7   1      < 0.1%
1664     1      < 0.1%
1661     1      < 0.1%
1582.6   1      < 0.1%
1560.8   1      < 0.1%
1501.42  1      < 0.1%
1420.6   1      < 0.1%
1383.4   1      < 0.1%

lf_ms
Real number (ℝ)

Distinct: 9687
Distinct (%): 4.0%
Missing: 1612
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.45043751
Minimum: 0.01
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.0253
Q1: 0.158
Median: 0.36
Q3: 0.75
95-th percentile: 1
Maximum: 1
Range: 0.99
Interquartile range (IQR): 0.592

Descriptive statistics

Standard deviation: 0.33266903
Coefficient of variation (CV): 0.73854646
Kurtosis: -1.2506742
Mean: 0.45043751
Median Absolute Deviation (MAD): 0.24
Skewness: 0.43051319
Sum: 110061.25
Variance: 0.11066868
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1                    13803   5.6%
0.01                 6212    2.5%
0.1                  5593    2.3%
0.11                 5124    2.1%
0.12                 4703    1.9%
0.99                 4690    1.9%
0.13                 4348    1.8%
0.02                 3858    1.6%
0.14                 3846    1.6%
0.15                 3715    1.5%
Other values (9677)  188451  76.6%

Minimum 10 values

Value   Count  Frequency (%)
0.01    6212   2.5%
0.0101  18     < 0.1%
0.0102  21     < 0.1%
0.0103  19     < 0.1%
0.0104  15     < 0.1%
0.0105  22     < 0.1%
0.0106  22     < 0.1%
0.0107  24     < 0.1%
0.0108  25     < 0.1%
0.0109  20     < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
1       13803  5.6%
0.9999  2      < 0.1%
0.9998  24     < 0.1%
0.9997  33     < 0.1%
0.9996  37     < 0.1%
0.9995  35     < 0.1%
0.9994  38     < 0.1%
0.9993  31     < 0.1%
0.9992  45     < 0.1%
0.9991  31     < 0.1%

fare_low
Real number (ℝ)

HIGH CORRELATION 

Distinct: 32283
Distinct (%): 13.2%
Missing: 1612
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 190.67594
Minimum: 50
Maximum: 2725.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 50
5-th percentile: 93.19
Q1: 140.06
Median: 181.63
Q3: 230.04
95-th percentile: 312.57
Maximum: 2725.6
Range: 2675.6
Interquartile range (IQR): 89.98

Descriptive statistics

Standard deviation: 73.577694
Coefficient of variation (CV): 0.38587823
Kurtosis: 18.130088
Mean: 190.67594
Median Absolute Deviation (MAD): 44.49
Skewness: 1.9783875
Sum: 46590331
Variance: 5413.677
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
178                   45      < 0.1%
171                   43      < 0.1%
149                   43      < 0.1%
180                   43      < 0.1%
147                   42      < 0.1%
175                   41      < 0.1%
158                   39      < 0.1%
202                   39      < 0.1%
153                   37      < 0.1%
154                   37      < 0.1%
Other values (32273)  243934  99.2%
(Missing)             1612    0.7%

Minimum 10 values

Value  Count  Frequency (%)
50     1      < 0.1%
50.1   1      < 0.1%
50.4   2      < 0.1%
50.41  1      < 0.1%
50.5   2      < 0.1%
50.6   2      < 0.1%
50.65  1      < 0.1%
50.72  1      < 0.1%
50.8   3      < 0.1%
50.9   2      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
2725.6   1      < 0.1%
1897.7   1      < 0.1%
1664     1      < 0.1%
1420.6   1      < 0.1%
1383.4   1      < 0.1%
1336.5   1      < 0.1%
1312     1      < 0.1%
1269.78  1      < 0.1%
1268.05  1      < 0.1%
1261.5   1      < 0.1%

Interactions

[Figure: 8 × 8 grid of pairwise interaction plots for the numeric variables.]

Correlations

            Year    fare    fare_lg  fare_low  large_ms  lf_ms   nsmiles  passengers  quarter
Year        1.000   0.194   0.193    0.216     0.103     0.107   0.023    0.107       0.037
fare        0.194   1.000   0.965    0.863     -0.215    -0.215  0.521    -0.267      0.014
fare_lg     0.193   0.965   1.000    0.821     -0.204    -0.266  0.492    -0.216      0.016
fare_low    0.216   0.863   0.821    1.000     -0.116    0.052   0.428    -0.300      0.011
large_ms    0.103   -0.215  -0.204   -0.116    1.000     0.410   -0.408   -0.083      0.007
lf_ms       0.107   -0.215  -0.266   0.052     0.410     1.000   -0.237   -0.207      0.006
nsmiles     0.023   0.521   0.492    0.428     -0.408    -0.237  1.000    -0.103      0.009
passengers  0.107   -0.267  -0.216   -0.300    -0.083    -0.207  -0.103   1.000       0.011
quarter     0.037   0.014   0.016    0.011     0.007     0.006   0.009    0.011       1.000
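
The high-correlation alerts at the top of the report can be related back to this matrix. The sketch below flags every off-diagonal pair whose absolute coefficient is at least 0.5. The 0.5 cut-off is an assumption made for illustration (the report does not state its threshold), and the loop is not the profiler's own alert logic.

```python
# Correlation matrix copied from the table in this report (row/column order as listed).
names = ["Year", "fare", "fare_lg", "fare_low", "large_ms",
         "lf_ms", "nsmiles", "passengers", "quarter"]
matrix = [
    [1.000,  0.194,  0.193,  0.216,  0.103,  0.107,  0.023,  0.107, 0.037],
    [0.194,  1.000,  0.965,  0.863, -0.215, -0.215,  0.521, -0.267, 0.014],
    [0.193,  0.965,  1.000,  0.821, -0.204, -0.266,  0.492, -0.216, 0.016],
    [0.216,  0.863,  0.821,  1.000, -0.116,  0.052,  0.428, -0.300, 0.011],
    [0.103, -0.215, -0.204, -0.116,  1.000,  0.410, -0.408, -0.083, 0.007],
    [0.107, -0.215, -0.266,  0.052,  0.410,  1.000, -0.237, -0.207, 0.006],
    [0.023,  0.521,  0.492,  0.428, -0.408, -0.237,  1.000, -0.103, 0.009],
    [0.107, -0.267, -0.216, -0.300, -0.083, -0.207, -0.103,  1.000, 0.011],
    [0.037,  0.014,  0.016,  0.011,  0.007,  0.006,  0.009,  0.011, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# For each variable, collect the other variables it is strongly correlated with.
alerts = {}
for i, name in enumerate(names):
    partners = [other for j, other in enumerate(names)
                if i != j and abs(matrix[i][j]) >= THRESHOLD]
    if partners:
        alerts[name] = partners

for name, partners in alerts.items():
    print(f"{name}: {', '.join(partners)}")
```

With this cut-off the flagged variables are fare, fare_lg, fare_low and nsmiles, the same four that carry a high-correlation alert in the overview.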

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
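
One common way to compute a nullity correlation is the Pearson correlation between the missingness indicators of two columns. The sketch below takes that approach on a small hypothetical sample; the function name and the toy values are illustrative only, and whether the report's heatmap is computed exactly this way is an assumption.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never (or always) missing
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns, not taken from the dataset; None marks a missing cell.
large_ms = [0.47, None, 0.99, None, 0.61, 0.72]
fare_lg = [219.98, None, 184.44, None, 184.49, 244.89]
lf_ms = [0.12, None, 0.99, 0.39, None, 0.72]

same_rows = nullity_correlation(large_ms, fare_lg)    # missing in identical rows
partly_shared = nullity_correlation(large_ms, lf_ms)  # missing rows overlap only in part
print(same_rows, partly_shared)
```

In this report large_ms and fare_lg each have 1540 missing values, and lf_ms and fare_low each have 1612. That would be consistent with each pair being missing in the same rows, but equal counts alone do not prove it.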

Sample

First rows

   Year  quarter  nsmiles  passengers  fare    large_ms  fare_lg  lf_ms   fare_low
0  2021  3        970      180         81.43   1.0000    81.43    1.0000  81.43
1  2021  3        970      19          208.93  0.4659    219.98   0.1193  154.11
2  2021  3        580      204         184.56  0.9968    184.44   0.9968  184.44
3  2021  3        580      264         182.64  0.9774    183.09   0.9774  183.09
4  2021  3        328      398         177.11  0.6061    184.49   0.3939  165.77
5  2021  3        1974     153         324.97  0.4263    323.73   0.1609  298.20
6  2021  3        1974     16          315.90  0.7285    270.42   0.7285  270.42
7  2021  3        1974     22          329.22  0.5415    271.60   0.5415  271.60
8  2021  3        1670     159         255.89  0.7212    244.89   0.7212  244.89
9  2021  3        1670     151         291.16  0.4404    296.88   0.3197  247.20

Last rows

        Year  quarter  nsmiles  passengers  fare    large_ms  fare_lg  lf_ms   fare_low
245945  2024  1        464      184         235.68  0.9463    229.01   0.9463  229.01
245946  2024  1        464      62          231.34  0.9482    224.58   0.9482  224.58
245947  2024  1        665      99          183.51  0.5657    97.38    0.5657  97.38
245948  2024  1        665      5           332.42  0.7442    310.57   0.7442  310.57
245949  2024  1        665      8           280.76  0.5658    254.62   0.5658  254.62
245950  2024  1        665      207         278.70  0.7503    287.44   0.2359  248.46
245951  2024  1        724      277         148.69  0.8255    114.45   0.8255  114.45
245952  2024  1        724      70          330.19  0.8057    321.92   0.8057  321.92
245953  2024  1        550      178         95.65   1.0000    95.65    1.0000  95.65
245954  2024  1        550      57          330.15  0.5212    288.38   0.5212  288.38

Duplicate rows

Most frequently occurring

   Year  quarter  nsmiles  passengers  fare  large_ms  fare_lg  lf_ms  fare_low  # duplicates
0  1999  2        248      0           92.0  1.0       92.0     1.0    92.0      2
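
The overview reports 1 duplicate row while this table shows 2 under "# duplicates". One reading that reconciles the two figures is that the first counts distinct rows occurring more than once and the second counts how often such a row occurs. The sketch below illustrates that reading on a hypothetical three-row sample; it is not the profiler's implementation.

```python
from collections import Counter

# Hypothetical mini-sample in the report's column order:
# (Year, quarter, nsmiles, passengers, fare, large_ms, fare_lg, lf_ms, fare_low)
rows = [
    (1999, 2, 248, 0, 92.0, 1.0, 92.0, 1.0, 92.0),
    (2021, 3, 970, 180, 81.43, 1.0, 81.43, 1.0, 81.43),
    (1999, 2, 248, 0, 92.0, 1.0, 92.0, 1.0, 92.0),  # exact repeat of the first row
]

# Count how often each complete row occurs, then keep those seen more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

print(f"distinct duplicated rows: {len(duplicated)}")
for row, n in duplicated.items():
    print(row, "occurs", n, "times")
```

In the sample, one distinct row is duplicated and it occurs twice, mirroring the 1 and the 2 reported for the full dataset.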